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A large amount of data is typically collected during a periodontal 
exam. Analyzing these data poses several challenges. Several types of 
measurements are taken at many locations throughout the mouth. 
These spatially-referenced data are a mix of binary and continuous 
responses, making joint modeling difficult. Also, most patients have 
missing teeth. Periodontal disease is a leading cause of tooth loss, 
so it is likely that the number and location of missing teeth informs 
about the patient's periodontal health. In this paper we develop a 
multivariate spatial framework for these data which jointly models 
the binary and continuous responses as a function of a single latent 
spatial process representing general periodontal health. We also use 
the latent spatial process to model the location of missing teeth. We 
show using simulated and real data that exploiting spatial associ- 
ations and jointly modeling the responses and locations of missing 
teeth mitigates the problems presented by these data. 

1. Introduction. Periodontal disease or periodontitis is an inflammatory 
disease affecting periodontium, the tissues that support and maintain teeth. 
Periodontitis causes progressive bone loss around the tooth which can lead to 
tooth loosening and eventually tooth loss. It has been estimated that about 
50% of US adults over the age of 35 experience early stages of periodontal 
disease [Oliver, Brown and Loe (1998)], making periodontitis the primary 
cause of adult tooth loss. To measure periodontal status, dental hygienists 
often use a periodontal probe to measure several disease markers throughout 
the mouth. Three of the most popular markers are (a) clinical attachment 
loss (CAL), (b) periodontal pocket depth (PPD) and (c) bleeding on probing 
(BOP). PPD and CAL are continuous variables, usually rounded to the 
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Fig. 1. Observed CAL for a typical patient. The shaded boxes represent teeth, the cir- 
cles represent measurement sites, and the gray lines represent neighbor pairs connecting 
adjacent sites on the same tooth and sites that share a gap between teeth. "Maxillary" and 
"Mandibular" refer to upper and lower jaws respectively. The small numbers beside each 
tooth are the "tooth numbers." The maxilla's second tooth on the left is missing; third 
molars ("wisdom teeth") are excluded. 

nearest millimeter. CAL is the distance down a tooth's root that is no longer 
attached to the surrounding bone by the periodontal ligament, and PPD is 
the distance from the gingival margin to the base of the pocket. BOP is 
a binary response and is indicative of whether a particular site bled with 
the application of a dental probe. During a full periodontal exam, all three 
markers are usually measured at six pre-specified sites [Darby and Walsh 
(1995)] for each tooth (excluding the third molars, i.e., the wisdom teeth). 
So for a patient with no missing teeth, there are S = 168 measurements for 
each marker (Figure 1). 

The motivating example is a clinical study conducted at the Medical 
University of South Carolina (MUSC) to determine the periodontal disease 
status for Type-2 diabetic Gullah-speaking African- Americans, originally 
presented in Fernandez et al. (2009). The objective of this analysis is to 
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quantify the disease status of this population, and to study the associations 
between disease status and patient-level covariates such as age, BMI, gender, 
HbAlC and smoking status. 

Quantifying a patient's disease status from the extensive data collected 
during a periodontal exam is difficult. For example, it is common to sum- 
marize disease status using the whole-mouth average CAL or the number of 
teeth with CAL above a certain threshold. Using the whole-mouth average 
CAL as the response in a regression with patient-level covariates is reason- 
able when the patients' residual distributions are identical. However, this 
assumption is often violated in practice, as different patients have different 
error variances, spatial covariances and missing data patterns. In this paper 
we present a multivariate spatial model to jointly analyze periodontal data 
from multiple markers and multiple measurement locations to improve es- 
timation of disease status, and hence develop a more powerful method for 
studying the association between patient-level covariates and periodontal 
disease. 

We use spatial factor analysis [Wang and Wall (2003), Hogan and Tchernis 
(2004), Lopes, Salazar and Gamerman (2008)] to model these multivariate 
spatial data. We postulate that the three markers are all related to a single 
latent spatial process (factor) measuring periodontal health. The latent peri- 
odontal health factor varies from site to site and is smoothed spatially using a 
conditionally autoregressive prior [Besag, York and Mollie (1991), Banerjee, 
Carlin and Gelfand (2004)]. The data collected for this study provide inter- 
esting challenges that require extensions of the spatial factor model. First, 
the data are a mix of continuous and binary responses. To jointly model 
these data types, we develop a spatial probit model for binary responses, 
which has the advantage of being fully-conjugate and leads to rapid MCMC 
sampling and convergence. Also, we have data from multiple patients, and 
exploratory analysis suggests that the covariance of the latent spatial factor 
varies by patient. Therefore, we develop a hierarchial model which allows the 
covariance to vary between patients, but pools information across patients to 
estimate the covariance parameters. We show in a simple case that in terms 
of estimating the effect of patient-level covariates, this model is equivalent 
to a weighted multiple regression, where the patient's scalar response is a 
linear combination of all data across location and marker, and the patient's 
weight decreases with the spatial correlation and variability. 

Another challenging aspect of analyzing periodontal data is the consider- 
able number of missing teeth (around 20% for these data). The assumption 
that teeth are missing completely at random is not valid because periodontal 
disease is the leading cause of adult tooth loss, so patients with several miss- 
ing teeth likely have poor periodontal health. For nonspatial data a common 
method to handle so-called "informative cluster size" is to include the num- 
ber of observations as a covariate, or in the weights of a weighted regression 
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[Hoffman, Sen and Weinberg (2001), Williamson, Datta and Satten (2003), 
Follman, Proschan and Leifer (2003), Lu (2005), Benhin, Rao and Scott 
(2005), Panageas et al. (2007), Cong, Yin and Shen (2007), Williamson et 
al. (2008)]. Dunson, Chen and Harry (2003) take a different approach. They 
propose a joint model for clustered mixed (continuous and binary) data and 
the number of responses in each cluster, using a continuation ratio pro- 
bit model for cluster size. Another approach is the shared parameter model 
[e.g., Wu and Carroll (1988), Follman and Wu (1995)]. The shared parameter 
model accounts for informative missingness by introducing random effects 
that are shared between the missing data process and the measurement pro- 
cess. Conditioned on the random effects, the missing data and measurement 
processes are assumed to be independent. 

We propose a shared variable model to jointly model missing teeth with 
the other markers of periodontal disease. However, in our spatial setting both 
the number and location of missing teeth are informative. For example, a 
missing tooth in the front of the mouth surrounded by teeth with low CAL 
may not be informative; in contrast, a missing tooth in the back of the mouth 
(where periodontal disease is often the most advanced) surrounded by teeth 
with high CAL is indicative of poor periodontal health in that region of the 
mouth. Therefore, we model the number and spatial distribution of missing 
teeth using our latent spatial factor model. In this model, CAL, PPD, BOP 
and the location of missing teeth are all modeled simultaneously in terms of a 
latent periodontal health factor; this approach uses all available information 
to estimate periodontal disease status. 

The paper proceeds as follows. Section 2 presents our unified approach 
to modeling multivariate spatially-referenced periodontal data, as well as 
our model for informatively missing teeth. Section 3 offers some influence 
diagnostics to determine which patients and response types are the most 
informative about the patient-level covariates. Computing details are given 
in Section 4. Section 5's simulation study shows that accounting for spatial 
association and informative observation location can lead to a substantial 
improvement in estimating the patient-level covariate effects. We analyze 
the periodontal data in Section 6. Section 7 concludes. 

2. Latent spatial factor model for periodontal data. In this section we 
describe our approach for spatially-referenced mixed periodontal data with 
informative missingness. We begin in Section 2.1 by specifying a latent spa- 
tial factor model assuming no missing teeth. Section 2.2 introduces the spa- 
tial probit model for missing teeth and Section 2.3 specifies priors and dis- 
cusses identifiability of the latent variable model. 

2.1. Complete data model. We assume our multivariate spatial data has 
J types of responses (for the periodontal data the J = 3 responses are CAL, 
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PPD and BOP) at each spatial location for each patient. If the jth type of 
response is continuous (CAL and PPD), let jjij(s) be the response at spatial 
location s for patient i, s = 1, . . . , S and i = 1, . . . ,N. Our data also has 
binary responses (BOP). If the jth type of response is binary, let y*j(s) be 
the response at spatial location s for patient i. We model binary responses 
using probit regression, that is, V*j{s) = I(yij(s) > 0), where /(•) is the binary 
indicator function and yij(s) is a Gaussian latent variable. 

All J responses are modeled as functions of the latent spatial disease 
status, Hi(s), which represents the overall periodontal health of patient % at 
location s. Let 

(1) yij(s) = aj + bjfii(s) + Sij(s), 

where ctj is the intercept for response j, bj relates the latent factor to re- 
sponse type j, and Sij(s) ~ N(0,afj) is error. As is customary for probit 
regression, we assume afj = 1 for binary responses for identification. Since 
all J responses depend on the common latent factor, they are correlated 
with 

(2) Corfe-M^M)- «*V«M.)1 



b) V4i(.)| + 4 ^ Var [«(„)] + al 

The slopes bj and bi determine the sign and magnitude of the correlation; if 
either bj or bi is zero, then yij(s) and ya(s) are uncorrelated, and if bj and 
bi share (do not share) the same sign, then yij(s) and yu(s) are positively 
(negatively) correlated. 

The latent vector ^ = (/ij(l), . . . ,fXi(S))' has a multivariate normal prior 
with conditionally autoregressive covariance ["CAR," Besag, York and Mollie 
(1991)]. The mean of /x^ is 

E(jii) = Wa + SUP, 

where W is an S x q matrix of spatial covariates (e.g., tooth number) that 
do not vary across patient, Xj is the p- vector of patient-level covariates (e.g., 
age) that do not vary across space within patient, J7j = X[ <g> Is, Is is the 
iS-vector of ones, and a and j3 are the corresponding regression parameters. 
The covariance of /ij is rfQ{pi)~ l , where Q(pi) = M — piD, D ss i is one if 
locations s and s' are adjacent and zero otherwise, M is diagonal with diag- 
onal elements M ss = D ss >. In this spatial model, p% € [0, 1] controls the 
degree of spatial association and if > controls the magnitude of variation. 
Let ri(s) = Pi(s) — E([j,i(s)). A convenient interpretation of the CAR prior is 
that the conditional distribution of rj(s) given rj(s') for all s' ^ s is normal 
with mean pifi(s) and variance jf /m(s), where fj(s) is the average of rj(s) 
at location s's m(s) neighbors. 
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The degree of spatial variation is allowed to differ between patients by 
means of afj, rf and pi. To pool information across patients, we use models 

(?ij 2 \cj , dj ~ Gamma(cj ,dj), 

(3) rr 2 | e ,/~Gamma(e,/), 

Pi\g,h~Bet&(g,h), 
where {cj}, {dj}, e, /, g and h are hyperparameters. 

2.2. Model for the location of missing teeth. For our data described in 
Section 1, roughly 20% of the teeth are missing. The locations of the missing 
teeth are not random, but rather related to the periodontal health in that 
region of the mouth. Therefore, we propose a model for the location of 
missing teeth as a function of the underlying latent factor p,i(s). 

For our data either the six observations on a tooth for all J responses are 
all observed or all unobserved. That is, if a tooth is missing, we have no data 
for the tooth, and if a tooth is not missing, we have complete data. Let y^t) 
be an indicator of whether tooth t = 1, . . . , T is missing for patient i. As with 
the binary data in Section 2, we model y* (t) using probit regression. Let 
Vio(t) = I(yio{t) > 0)) where yio(t) is a latent continuous variable. Then 

(4) yio(t) = a + b Z[fii + £io(t), 

where Z± is such that Z' t n i is the mean of ix i at the six observations on 

tooth t and e$o(£) iV(0, 1). ao and 6o relate the latent process to the 
missing tooth indicator. Note that since Pi{s) is included in both the model 
for presence of and value of the responses, both presence and value of the 
data contribute to the posterior of Pi(s), and thus the posterior of j3. Also 
note that 6jo = corresponds to independence between the latent factor and 
the location of missing teeth, in which case the location of missing teeth 
does not contribute to estimating (3. 

2.3. Identifiability and prior choice. Identifiability is a key issue in latent 
variable modeling. To see this, we inspect the first two moments of the 
multivariate response at location s for patient i after integrating over the 
latent factor 

E(y lJ (s)) = a J + b J [W(s)a + X' i f3}, 

(5) 

Cov{yij(8),yu(s)) = bjbm q(s) + I (J = l)a ip 

where W(s) is the row of W corresponding to location s and q(s) is the (s, s) 
diagonal element of Q{pi)~ l . Identifiability concerns arise in both moment 
expressions, as multiplying all of the slopes bj by scalar c and dividing a, 
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(3 and rf by c gives identical moments. Although there are other ways to 
address this issue, we fix b\ = 1. This identifies both the regression coeffi- 
cients a. and (3 via the mean of the first response and the CAR variance rf 
via the variance of the first response. In our analysis of periodontal data of 
Section 6 we take the first response with fixed slope to be clinical attachment 
loss, the most commonly used measure of periodontal disease. We also com- 
pare these results with other baseline assignments and discuss sensitivity to 
this assumption. 

The regression coefficients {dj}, {bj} (j ^ 1), a and (3 have independent 
N(0,w 2 ) priors. The hyperparameters {cj}, {dj}, e, f, g and h have inde- 
pendent Gamma(ti,v) priors. In the simulation study (Section 5) and data 
analysis (Section 6) we take u = v = 0.1 and w = 10 to give vague yet proper 
priors. We conduct a sensitivity analysis in Section 6 which shows that the 
results are not sensitive to these priors for this large periodontal data set. 

3. Influence diagnostics. Our primary interest is in the patient-level pa- 
rameters (3. In this complicated hierarchical Bayesian model, we would like 
to identify the sources of data that are most informative about (3. In this 
section we develop diagnostics to determine which patients, spatial locations 
and response types are the most influential. We assume no missing teeth, 
that all responses are Gaussian, and that no covariates depend on space (W 
is null). In this case the regression coefficients only affect the overall average 
response for each patient, and a tempting simplification is to collapse data 
over space and use each patient's overall average as a scalar response. We 
show that even in this case different areas of the mouth are more than less 
informative, and that patients are weighted differently depending on their 
spatial covariance parameters. This motivates the hierarchical spatial model 
even in this simple case. 

Integrating over latent effect fi { , but conditioning on er?-, rf, P i, a.j and 
bj, the posterior for (3 is Gaussian with 



COV(/3)= ^>pqx^ , 



(6) 



N \ - 1 N 



,i=l / i=l 



where 

(7) Wi = rr 2 l'[Q( Pl ) - Q( Pl )(6 i I s + Q{ Pi )T l Q{ Pi )\l, 

— l'Q(pi)(8iI s + Q(pi)) 1 ^2b j a ij 2 (y ij - aj). 



Zi 

Wi 
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(a) Spatial weights k(s) (b) Patient weights Wi 

Fig. 2. Panel (a) plots the spatial weights k(s) for various S and p. "Maxillary" and 
"Mandibular" refer to upper and lower jaws respectively, while "buccal" and "Ungual" refer 
to the cheek and the tongue sides of the teeth, respectively. The thin lines have p = 0.1, the 
wide lines have p = 0.99; the solid lines have r5 = 0.2, the dashed lines have 5 = 5. Panel 
(b) plots the patient weights u>i for various Si, n and pi. 



The posterior in (6) is equivalent to a weighted linear regression where each 
patient contributes the scalar response Zi and is weighted according to W{. 
Analyzing Zj and Wi shows which sites, patients and outcomes contribute 
the most to /3's posterior. 
First we consider zf 

1 3 b 

Zl = —i'[Q( Pt )(5 t i s + Q(pi))- 1 ] Y,^r(yij - a j) 

3=1 °« 

■ s 

^2h(s)(yij(s) - aj) , 

,s=l 

where the vector ki = l'[Q(pi)(5iI s + Q{pi))~ l ]/wi. Therefore, Z\ is a linear 
combination of all the observations for patient i, with k(s) controlling the 
relative weight of observations at location s and bj/afj controlling the rela- 
tive weight of response type j. Figure 2(a) plots k(s) (scaled to sum to S) 
for four combinations of pi and 8%. Observations in the gaps between teeth 
have the highest weight; these sites have the most neighbors and thus the 
smallest prior variance. Observations in the back of the mouth and on the 
sides of teeth get less weight. 



^ a 2 - 

7 = 1 l 3 
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The patient weights Wi are plotted as a function of pi, Si and r, in Fig- 
ure 2(b). The weight decreases with pi and rf, and increases with Si (in- 
versely related to error variances crfj). That is, patients with little spatial 
association and small variances rf and cjf- (and thus large Si) have the 
most influence on /3's posterior. To search for overly-influential patients, we 
compute the weights by evaluating (7) using posterior means pi, ff and 
a\. However, the marginal posterior for (3 is not available in closed- form for 
Section 6's data with informative missing teeth and binary responses. There- 
fore, we use only the CAL error variance that is, Si = ff /af x (b\ = 1, 
Section 2.3), as an approximation. This approximation is not meant to be 
definitive, but rather a useful heuristic device. 



4. MCMC sampling algorithm. MCMC sampling is carried out using the 
free software R (http://www.r-project.org/), although it would also be 
straightforward to implement the model using WinBUGS 
(http://www.mrc-bsu.cam.ac.uk/bugs/). Sample code to analyze a sin- 
gle continuous response is available in the supplemental article [Reich and 
Bandyopadhyay (2009)]. We draw 20,000 MCMC samples and discard the 
first 5000 as burn-in. Convergence is monitored using trace plots of the de- 
viance as well as several representative parameters. 

The patient-specific parameters are conditionally-conjugate except for the 
CAR spatial association parameters pi, which are updated using Metropolis- 
Hastings sampling with a Beta(50/9* , 50(1 — p*)) candidate distribution, where 
p* is the value at the previous iteration. The remaining parameters are 
updated using Gibbs sampling with full conditionals given below. The la- 
tent continuous variables corresponding to the probit model for binary re- 
sponses, yij(s), are updated from their truncated full conditionals yij(s) ~ 
N(a,j + bjpi(s), 1), restricted to (-oo,0) if y*j(s) = and (0,oo) if y*j(s) = 1. 
The vector of latent effects for patient i, fii, is multivariate normal with 

rest)" 1 = Z'Z&g + Q{ Pi )/rf + ( £ b)j 
E{\ii | rest) = V(Hi\ rest) b Z'(y i ~ a Q ) 




+ Q(pi)(Wa + tU(3)/T? + ^2b J (y ij - a j )/af j 



where Z = (Z 1 ,...,Z T ), = (y ij (l),...,y ij (S)) and y i0 = (y i0 (l),..., 
Uio(T))- The measurement error variances for the continuous responses have 
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full conditional 

afj | rest ~ InvGamma 5/2 + cj , (y%j(s) — a,j — bjHi (s)) /2 + dj , 



(9) 



r 2 | rest ~ InvGamma(5/2 + e, r^Q(/9j)rj/2 + /), 



where rj = fa — Wa — O.i/3. 

The intercept /slope pairs (aj,bj) have bivariate normal full conditionals 
with mean 



N 



V((aj,bj)'\ Testy 1 = w~ 2 I 2 + K^i/vfj, 



i=l 

/ iV 



E((a j ,6 i )'|rest) = bj)'\ rest) ^Ajyy/a, 



i=i 



where A, = (1,/ij). The regression coefficients a and /3 have multivariate 
normal full conditionals with 



N 



fairest)" 1 =w~%. +'£ / W'Q(p i )W/T?, 

i=i 

JV 

E(a\ rest) = y(/3 s | rest)X' s ^Q( Pi )(ft - ^/3)/r, 



2 

i=l 



and 



IV 



V(/3| rest)" 1 = w~ 2 I p + £ ^Q^)^/^, 



i=i 

JV 



E(/3| rest) = y(/3| rest) ^^Q( Pi )( Mi - f7a)/r? 

i=i 



The remaining parameters {cj}, {dj}, e, f and 5 are updated using Metropo- 
lis sampling with Gaussian candidate distributions tuned to give acceptance 
ratios near 0.40. 



5. Simulation study. In this section we conduct a simulation study to 
demonstrate the effects of spatial correlation and informative missingness 
on the analysis of patient-level fixed effects. For computational purposes we 
assume only one quadrant (i.e., half jaw) for each patient leaving S = 42, 
that there are no spatial covariates W, and the same CAR spatial association 
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parameter for each patient, that is, pi = p. We also assume there is only a 
single continuous response. Data are generated from the model 

P(Vi(s) = observed) = 1 - $(a + b pi(s)), 

(10) 

yi(s)\yi{s) observed ~ N(m + b 1 p i (s),a i ), 

where \i i ~ iV([a^/3]l5, T 2 Q~ 1 (p)). Each simulated data set contains data 
generated from this model for N = 50 patients. The p = 6 patient-level 
covariates Xj are generated independently from the standard normal dis- 
tribution and the regression coefficients are (3 = (0, 0, 0, 1, 2, 3)/20. Finally, 
a i = b\ = 1 and ao = — 1. 

M = 100 data sets are generated from each of six designs specified by 
varying the true value of the covariance parameters of, t 2 and p and the 
missing data mechanism 6q: 



• 


Design 1 


9 = 


0.0, 


h 


= and of 


= r? = 


1, 




• 


Design 2 


9 = 


0.9, 


b 


= and of 


= T? = 


1, 




• 


Design 3 


9 = 


0.9, 


bo 


= and of 


= T? = 


1.5*I(i 


is odd) + 0.5 


• 


Design 4 


9 = 


0.9, 


bo 


= 1 and of 


= T? = 


1, 




• 


Design 5 


9 = 


0.9, 


bo 


= 1 and of 


= T? = 


1.5*I(i 


is odd) + 0.5 


• 


Design 6 


9 = 


0.5, 


bo 


= 1 and of 


= rf = 


1.5*I(i 


is odd) + 0.5 



Observations within patients are independent under the first design and 
spatially correlated under all other designs. The variances are the same for 
all patients under Design 2 and vary across patients for Design 3. Designs 4 
and 5 are similar to Designs 2 and 3, except that the locations of missing 
observations are informative with bo = 1. Design 6 is the same as Design 5, 
except with moderate spatial association p = 0.5. 

We analyze each simulated data set using five models: 

• Model 1: Linear regression, y, L = J2seSi Vi{ s )/\^i\ ~ N {x^fi , a 2 ) , 

• Model 2: Section 2's spatial model without informative missingness or 
patient-specific variances, that is, &o = 0, of = a 2 and t 2 = r 2 , 

• Model 3: Section 2's spatial model with patient-specific variances but 
without informative missingness, that is, &o = 0, 

• Model 4 '■ Section 2's spatial model with informative missingness but with- 
out patient-specific variances, that is, of = a 2 and t 2 = r 2 , 

• Model 5: Section 2's full spatial model, 

where Si in Model 1 is the set of locations of observed data for patient 
i. Model 1 ignores spatial associations and missing teeth, and simply uses 
each patient's average observed response in a multiple regression. Models 2-5 
explicitly model all observations individually and account for spatial associ- 
ations between nearby observations. 
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Table 1 

Simulation study results. Column labels "bo "- "/3e " give the proportion of 95% intervals 
that exclude zero. The Monte Carlo standard errors (not shown) are between 0.007 and 
0.045 for 100*MSE and between 0.003 and 0.006 for the bias 
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0.81 


0.297 


0.043 


O 


1 
1 




u.uo 


U.U / 


U.U4 


u. 1Z 


U.ol 


U.Ol 


U.DD / 


U.U t t 




n 
A 




U.U4 


U.Uo 


U.Uo 


U.lo 


U.ol 


U.oz 


U.000 


U.U / z 




3 




0.09 


0.09 


0.07 


0.36 


0.69 


0.95 


0.181 


0.027 




4 


0.08 


0.03 


0.08 


0.09 


0.12 


0.31 


0.56 


0.653 


0.077 




5 


0.08 


0.09 


0.12 


0.08 


0.39 


0.68 


0.96 


0.178 


0.034 


4 


1 




0.04 


0.08 


0.05 


0.14 


0.43 


0.70 


0.266 


-0.150 




2 




0.06 


0.05 


0.07 


0.16 


0.43 


0.76 


0.267 


-0.146 




3 




0.04 


0.06 


0.06 


0.18 


0.42 


0.72 


0.265 


-0.141 




4 


1.00 


0.04 


0.10 


0.04 


0.18 


0.58 


0.89 


0.278 


0.048 




5 


1.00 


0.02 


0.06 


0.05 


0.19 


0.58 


0.86 


0.267 


0.026 


5 


1 




0.05 


0.04 


0.08 


0.07 


0.19 


0.26 


0.780 


-0.229 




2 




0.11 


0.07 


0.12 


0.15 


0.26 


0.34 


0.725 


-0.217 




3 




0.12 


0.11 


0.18 


0.31 


0.67 


0.89 


0.221 


-0.075 




4 


1.00 


0.06 


0.10 


0.09 


0.16 


0.46 


0.71 


0.693 


0.187 




5 


1.00 


0.06 


0.09 


0.08 


0.29 


0.66 


0.94 


0.193 


0.023 


6 


1 




0.06 


0.04 


0.07 


0.10 


0.28 


0.47 


0.409 


-0.200 




2 




0.16 


0.11 


0.12 


0.18 


0.43 


0.66 


0.383 


-0.191 




3 




0.07 


0.09 


0.11 


0.45 


0.89 


1.00 


0.086 


-0.062 




4 


1.00 


0.09 


0.07 


0.11 


0.30 


0.76 


0.97 


0.279 


0.130 




5 


1.00 


0.06 


0.07 


0.08 


0.61 


0.96 


1.00 


0.070 


0.024 



The results are presented in Table 1. For each model and each design, 
we calculate the proportion of the 95% posterior intervals for bo and the 
regression coefficients that exclude zero. We also compute the mean squared 
error and relative bias, MSE = 7^7 X^m=i Y^=i0j ~ Pj) 2 an d RelBiasj = 

~M zCm=i(<^"^ — Pj)/fij-> where is the posterior mean of /3j for the mth 
simulated data set and [3j is the true value. Relative bias is only presented 
for the largest coefficient, (3q. 
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Data for the first design are generated without spatial association or infor- 
mative missingness. In this case all five models give nearly identical results, 
demonstrating that the spatial models are able to approximate the simple 
regression model if appropriate. The five models are also nearly identical for 
Design 2 where the data are generated with spatial correlation and the same 
variances for each patient. In this case the patient means are Gaussian 
with mean a\ + and the same variances, satisfying the usual regression 
assumptions. 

The linear regression model does not perform well for Design 3's spatial 
model with patient-dependent variances. In this case the patient means yi 
have different variances, violating the usual regression assumptions. The spa- 
tial models that allow for patient-dependent variances (Models 3 and 5) give 
dramatic improvements in both power and mean squared error compared to 
the homoskedastic models. 

The locations of missing observations are informative for Designs 4, 5 
and 6. For these two designs the models (Models 1-3) that do not account 
for informative missingness are biased for f3§. The models that allow for 
informative location consistently identify bo as nonzero (power 1.0 in all three 
designs), which alleviates the bias for the nonnull predictors and improves 
power. Design 5 has both informative missingness and patient-dependent 
variances, common traits of periodontal data. In this case our full model is 
more than three times more powerful for (5q (0.26 to 0.94) and has roughly 
one fourth the mean squared error (0.193 to 0.780) of the usual nonspatial 
regression approach. 

6. Analysis of periodontal data. The motivating data were collected 
from a clinical study [Fernandes et al. (2009)] conducted by the Center 
for Oral Health Research (COHR) at the Medical University of South Car- 
olina (MUSC). The relationship between periodontal disease and diabetes 
level has been previously studied in the dental literature [Faria-Almeida, 
Navarro and Bascones (2006), Taylor and Borgnakke (2008)]. The objective 
of this study was to explore the relationship between periodontal disease 
and diabetes level (determined by the popular marker HbAlc, or "glycosy- 
lated hemoglobin") in the Type-2 diabetic adult (13 years or older) Gullah- 
speaking African- American population residing in the coastal sea-islands of 
South Carolina. Since this is part of an ongoing study, we selected 199 pa- 
tients with complete covariate information and with at least 50% responses 
available. 

For each patient CAL, PPD and BOP are measured at six locations on 
each nonmissing tooth, as shown in Figure 1. Additionally, several patient- 
level covariates were obtained, including age (in years), gender (1 = Female, 
= Male), body mass index or BMI (in kg/m 2 ), smoking status (1 = a 
smoker, = never) and HbAlc (1 = High, = controlled). We also include 
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spatial covariates for the site in the gap between teeth (1 = in the gap, 
= on the side of a tooth), jaw (1 = maxilla, = mandible) and six tooth 
number indictors with the first tooth (front of the mouth, Figure 1) serving 
as the reference tooth. All covariates are standardized to have mean zero 
and variance one. The spatial adjacency structure is shown in Figure 1; we 
consider neighboring sites on the same tooth as well as neighboring sites on 
the consecutive teeth to be adjacent. 

We begin by fitting several models with the same variances for all pa- 
tients, that is, afj = crj, rf = r 2 and p 2 = p 2 . We fit four models by assuming 
spatial association (p ~ Unif [0, 1]) and independence (p = 0), and assuming 
missing teeth are informative (bo ^ 0) and not informative (&o = 0). Table 2 
gives posterior 95% intervals for several parameters. The slopes bj for pocket 
depth and bleeding on probing (as described in Section 2.3, slope for attach- 

Table 2 

Posterior 95% intervals for models assuming variances afj and rf are constant across 
patients. "Spatial" models take p^=0 and models with informative missing teeth ("Info 

missing") have bo ^ 



Spatial 


No 




No 


Yes 


Yes 


Info missing 


No 




Yes 


No 


Yes 


Age 


(-0.002, 0.022) 




(0.009, 0.033) 


(0.000, 0.079) 


(0.036, 0.115) 


Female 


(-0.129, -0.104) 


(- 


-0.128, -0.103) 


(-0.181, -0.103) 


(-0.173, -0.096) 


BMI 


(-0.016, 0.007) 


( 


-0.014, 0.011) 


(-0.048, 0.030) 


(-0.033, 0.046) 


Smoker 


(0.028, 0.051) 




(0.028, 0.051) 


(0.014, 0.091) 


(0.010, 0.088) 


Hbalc 


(0.114, 0.139) 




(0.118, 0.143) 


(0.123, 0.199) 


(0.128, 0.207) 


ao: missing 




(- 


-1.390, -1.239) 




(-1.349, -1.172) 


oi: CAL 


(1.008, 1.052) 




(1.021, 1.087) 


(0.993, 1.112) 


(1.002, 1.139) 


a 2 : PPD 


(1.015, 1.055) 




(1.034, 1.101) 


(1.092, 1.214) 


(1.104, 1.139) 


a 3 : BOP 


(-0.369, -0.323) 


(- 


-0.359, -0.312) 


(-0.399, -0.327) 


(-0.394, -0.309) 


bo: missing 






(0.432, 0.513) 




(0.434, 0.544) 


b 2 : PPD 


(1.144, 1.178) 




(1.131, 1.160) 


(1.021, 1.047) 


(1.014, 1.043) 


6 3 : BOP 


(0.473, 0.510) 




(0.475, 0.510) 


(0.510, 0.547) 


(0.434, 0.544) 


P 








(0.972, 0.978) 


(0.972, 0.978) 


T 


(1.454, 1.499) 




(1.464, 1.508) 


(0.832, 0.870) 


(0.838, 0.874) 


o-i : CAL 


(0.942, 0.961) 




(0.972, 0.978) 


(0.881, 0.900) 


(0.935, 0.953) 


cr 2 : PPD 


(0.106, 0.182) 




(0.177, 0.219) 


(0.454, 0.486) 


(0.464, 0.493) 


Tooth 2 


(-0.027, 0.038) 


( 


-0.038, 0.034) 


(-0.063, 0.027) 


(-0.072, 0.022) 


Tooth 3 


(0.037, 0.101) 




(0.022, 0.091) 


(-0.007, 0.106) 


(-0.025, 0.100) 


Tooth 4 


(0.174, 0.241) 




(0.179, 0.253) 


(0.106, 0.242) 


(0.138, 0.275) 


Tooth 5 


(0.237, 0.306) 




(0.263, 0.339) 


(0.214, 0.369) 


(0.296, 0.451) 


Tooth 6 


(0.751, 0.825) 




(0.849, 0.932) 


(0.597, 0.773) 


(0.853, 1.037) 


Tooth 7 


(0.866, 0.954) 




(0.955, 1.048) 


(0.597, 0.773) 


(0.986, 1.151) 


Gap 


(0.953, 1.001) 




(0.938, 0.994) 


(0.992, 1.030) 


(0.986, 1.022) 


Maxilla 


(-0.289, -0.246) 


(- 


-0.293, -0.247) 


(-0.376, -0.235) 


(-0.363, -0.211) 
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ment loss is fixed at one) are significantly positive for all models, suggesting 
strong positive associations between the three responses. Several covariates 
are significant for all models, including patient effects gender, smoking sta- 
tus and HbAlc status, indicators of a site in the gap between teeth and a 
site on the upper jaw, and several tooth number indictors with sites in the 
back of the mouth having higher mean responses. 

The slope relating the latent spatial process with the probability of a miss- 
ing tooth, bo, is also significantly positive. This matches the intuition that 
patients with poor periodontal health generally have more missing teeth. 
Figure 3 plots the data and fitted values for a typical patient to illustrate the 
effects of accounting for informative missing teeth. This plot compares the 
spatial models with 60 set to zero (solid lines) and bo not set to zero (dashed 
lines). The posterior means [Figure 3(a)-(c)] and credible sets [3(d)] are 
nearly identical for observations on nonmissing teeth. However, for missing 
teeth the fitted values for all three responses are larger (worse periodontal 
health) when accounting for informative observation location. 

Accounting for informative observation location also affects the patient 
effect for age. The 95% interval ignoring spatial association and informative 
observations location is (—0.002, 0.022), compared to (0.036, 0.115) for the 
full model. The measures of periodontal disease are cumulative, so it seems 
reasonable that age should be an important predictor. Our data show a 
relationship between age and the number of missing teeth; patients that 
are younger than 54 (the mean age) have an average of 135.8 (sd = 20.1) 
observations and patients that are older than 54 have an average of 124.7 
(sd = 22.2) observations. By accounting for this relationship, we identify age 
as a significant predictor of periodontal health. 

Section 5's simulation study shows that the fixed effects can also be af- 
fected if patients have different spatial covariances. To explore this possibility 
for our periodontal data, we apply Section 2's model with variances afj and 
if varying across patients. Figure 4(a) and (b) summarize the posteriors of 
the variance parameters. Here we see considerable variation across patients; 
Figure 4(b) shows that the posterior 95% intervals for r% are nonoverlapping 
for patients with small and large Tj. 

Section 3's Wi diagnostic in (7) indicates which patients are the most 
influential on the regression coefficients. The lOj (computed using only the 
CAL error variance) have median 28.0 and vary greatly across patients with 
95% interval (5.8, 61.7). Figure 4(c) and (d) plot CAL for the patients with 
the smallest and largest w%. The responses for the patient with smallest W{ 
vary considerably from site-to-site within the mouth, with attachment loss 
ranging from to 11 mm. Information about the patient-level covariates 
accumulate via the mean of the latent parameters fa; due to spatial vari- 
ability, the mean is quite uncertain for this patient and, thus, this patient 
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Fig. 3. Panels (a)-(c) plot the data (dots) and posterior mean of the expected response 
(lines) for a typical patient. Panel (d) plots the posterior mean (bold) and 95% interval 
(thin) for the latent spatial process fi(s). All plots include results for both the model with 
(dashed line) and without (solid line) informative observation location. "Maxillary" and 
"Mandibular" refer to upper and lower jaws respectively, while "buccal" and "lingual" refer 
to the cheek and the tongue sides of the teeth, respectively. 



provides little information about (3. In contrast, the patient with largest Wi 
is stable from site-to-site, providing reliable information the mean of and 
thus about f3. 
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(c) CAL (mm) for the patient with 10,; = 3.0 (d) CAL (mm) for the patient with w t = 74.1 

Fig. 4. Panel (a) gives the posterior medians of the patient- specific standard deviations 
(ji,Oij) for attachment loss, and panel (b) plots the posterior of the CAR standard devi- 
ations Ti (the horizontal lines in the ith column are the posterior 0.025, 0.25, 0.5, 0.75 
and 0.975 quantiles for Ti). Panels (c) and (d) plot the attachment loss for the patients 
with smallest and largest weights Wi, respectively. "Maxillary" and "Mandibular" refer to 
upper and lower jaws, respectively, while "buccal" and "lingual" refer to the cheek and the 
tongue sides of the teeth, respectively. 



Table 3 gives the 95% posterior intervals for several parameters from 
the model with patient-dependent variances. Comparing the spatial models 
with informative missingess, the results for the patient-level covariates are 
fairly similar for the models with and without patient-dependent variances 
(i.e., the final columns of Tables 2 and 3). However, we note that the width 
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Table 3 

Posterior 95% intervals for models assuming variances afj and rf vary across patients. 
"Spatial" models take pj^O and models with informative missing teeth ("Info missing") 



have bo ^= 


Spatial 


No 


No 


Yes 


Yes 


Info missing 


No 


Yes 


No 


Yes 




(— 019 007"! 

I U.ulC/) \J .\J\J I I 


(— 01 3 01 31 


(— 002 0571 


(0 023 086~1 


Kprn alp 
_L llldlL 


c_n 1 1 4 — n 086*1 


i'_n 1 1 5 — n 087*1 




1 U.1UO) U. ±VJ~± 1 


BMI 




!— 006 0901 


!— 015 040"] 


c_n noq 050~i 


Smoker 


(0.038, 0.062) 


(0.037, 0.061) 


(0.021, 0.078) 


(0.019, 0.075) 


Hbalc 


(0.089, 0.114) 


(0.090, 0.118) 


(0.095, 0.155) 


(0.106, 0.171) 


ao: missing 




(-1.154, -1.004) 




(-1.201, -1.040) 


01: CAL 


(0.859, 0.921) 


(0.851, 0.930) 


(0.855, 0.942) 


(0.892, 0.989) 


a 2 : PPD 


(0.899, 0.960) 


(0.889, 0.966) 


(0.920, 1.012) 


(0.958, 1.058) 


a 3 : BOP 


(-0.425, -0.373) 


(-0.424, -0.370) 


(-0.482, -0.414) 


(-0.464, -0.394) 


60: missing 




(0.265, 0.378) 




(0.294, 0.410) 


6 2 : PPD 


(1.002, 1.014) 


(1.001, 1.014) 


(1.017, 1.036) 


(1.013, 1.031) 


63: BOP 


(0.462, 0.499) 


(0.463, 0.498) 


(0.518, 0.559) 


(0.521, 0.560) 


P 






(0.954, 0.962) 


(0.958, 0.965) 


Tooth 2 


(-0.051, 0.010) 


(-0.050, 0.019) 


(-0.054, 0.023) 


(-0.063, 0.016) 


Tooth 3 


(0.022, 0.081) 


(0.021, 0.086) 


(0.007, 0.105) 


(-0.009, 0.095) 


Tooth 4 


(0.183, 0.246) 


(0.190, 0.258) 


(0.138, 0.251) 


(0.141, 0.254) 


Tooth 5 


(0.307, 0.374) 


(0.318, 0.391) 


(0.267, 0.385) 


(0.291, 0.412) 


Tooth 6 


(0.776, 0.850) 


(0.825, 0.909) 


(0.634, 0.763) 


(0.753, 0.899) 


Tooth 7 


(0.854, 0.940) 


(0.902, 0.991) 


(0.630, 0.769) 


(0.782, 0.938) 


Gap 


(0.887, 0.929) 


(0.886, 0.932) 


(0.894, 0.926) 


(0.890, 0.922) 


Maxilla 


(-0.280, -0.241) 


(-0.278, -0.238) 


(-0.305, -0.208) 


(-0.301, -0.201) 



of the credible intervals are smaller for the model with patient-dependent 
variances. 

Finally, we conducted a sensitivity analysis to determine the effect of mod- 
eling assumptions for the full spatial model with patient-dependent variances 
and informative observation location. We modified the analysis by changing 
the reference group with slope bj fixed to one from CAL to PPD and BOP, 
changing the hyperparameters u = v = 0.1 to u = v = 0.0001, and changing 
the hyperparameter w = 10 to w = 1000. The posterior 95% intervals are 
given in Table 4 for the patient effects, scaled by 6i for comparison across 
reference group. The modification with the largest effect is changing the 
reference group from CAL to BOP. The patient level effects are generally 
closer to zero using BOP as the reference group. Despite this change in scale, 
the signs of the coefficients and the subset of coefficients with intervals that 
exclude zero remains the same as the original analysis. 

Also, we consider modifying the adjacency structure shown by the gray 
lines in Figure 1 ("spatial grid 1") in two ways: first by not considering sites 
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Table 4 

Posterior 95% intervals for the patient effects for various modeling/prior choices for the 
full model with spatial correlation, patient dependent variance and informative 
observation location. "Ref group" refers to the response that has slope bj fixed to one, 
"u, v " and "w " are the hyperparameters for the covariance parameters and regression 
coefficients, respectively, as described in Section 2.3. To facilitate comparison across 
reference groups, the intervals for (3/bi are presented 



Ref group 


CAL 


PPD 


BOP 


CAL 


u, V 


0.1 


O.l 


O.l 


O.OOl 


w 


10 


10 


10 


10 


Spatial grid 


1 


1 


1 


1 


Age 


(0.023, 0.086) 


(0.025, 0.088) 


(0.007, 0.079) 


(0.039, 0.110) 


Female 


(-0.168, -0.104) 


(-0.175, -0.108) 


(-0.052, -0.031) 


(-0.169, -0.097) 


BMI 


(-0.009, 0.050) 


(-0.012, 0.052) 


(-0.003, 0.015) 


(-0.015, 0.054) 


Smoker 


(0.019, 0.075) 


(0.020, 0.077) 


(0.005, 0.023) 


(0.009, 0.075) 


Hbalc 


(0.106, 0.171) 


(0.110, 0.174) 


(0.033, 0.054) 


(0.122, 0.190) 


Ref group 


CAL 


CAL 


CAL 




u, V 


O.l 


O.l 


O.l 




w 


lOOO 


10 


10 




Spatial grid 


1 


2 


3 




Age 


(0.039, 0.109) 


(0.023, 0.068) 


(0.038, 0.105) 




Female 


(-0.174, -0.096) 


(-0.150, -0.103) 


(-0.167, -0.097) 




BMI 


(-0.017, 0.050) 


(-0.009, 0.035) 


(-0.011, 0.054) 




Smoker 


(0.009, 0.076) 


(0.024, 0.067) 


(0.010, 0.074) 




Hbalc 


(0.119, 0.191) 


(0.117, 0.164) 


(0.111, 0.176) 





on the opposite side of a tooth to be neighbors ("spatial grid 2") to give 
independent AR(1) models to the sites on the buccal and lingual sides of 
each jaw, and second by considering all pairs of observations on the same 
tooth to be neighbors ("spatial grid 3"). To determine how well each of 
these spatial grids fit our data, we use the deviance information criteria 
(DIC) of Spiegelhalter et al. (2002). To compare spatial structures using 
DIC, we analyze only a single continuous response, CAL, and do not consider 
informative missing teeth. DIC prefers grid 1 (DIC = 66,128) over grids 2 
(DIC = 68,168) and 3 (DIC = 68,562). Table 4 gives the posterior of the 
subject-level effects for the full data analysis using the three spatial grids; 
the results are not sensitive to the choice of spatial structure. 

7. Discussion. In this paper we develop a latent factor model for mul- 
tivariate spatial periodontal data with a mix of binary and continuous re- 
sponses. Our model allows for a different spatial covariance for each patient 
and for informative missing teeth. We show using simulated and real data 
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that accounting for these factors leads to a substantial improvement for 
estimating covariate effects compared to standard regression techniques. 

We have assumed throughout that the patient's periodontal health can be 
captured by a single latent factor. It would be straightforward, conceptually 
if not computationally, to include more latent factors. However, this leads 
to the problem of selecting the appropriate number of latent factors, inter- 
preting the roles of the different latent factors, and understanding the effects 
of the covariates on the different latent factors. For these data with three 
strongly-correlated responses we prefer the single factor model for compu- 
tational simplicity and interpretability. If multiple factors are allowed, the 
number of factors could be chosen using the deviance information criteria. 
Another approach would be to allow the number of factors to be unknown. 
Lopes, Salazar and Gamerman (2008) and Salazar, Gamerman and Lopes 
(2009) use reversible jump MCMC to account for uncertainty in the number 
of latent factors. Extending this approach to our setting may be complicated 
by the large number of subjects, since the proposal density would have to 
propose spatial models that simultaneously fit well for all 199 subjects. An- 
other possibility would be to extend the parameter expansion method of 
Ghosh and Dunson (2008) to the spatial setting. 

We have also assumed that the latent spatial process is Gaussian. For 
nonspatial data several authors have proposed methods that avoid assuming 
the shared random effects are Gaussian [Lin et al. (2000), Song, Davidian 
and Tsiatis (2002), Beunckens et al. (2008), Tsonaka, Verbeke and Lesaffre 
(2009)]. These approaches could be extended to the periodontal setting by 
replacing the Gaussian spatial model with a non-Gaussian spatial model 
[e.g., Gelfand, Kottas and MacEachern (2005), Griffin and Steel (2006), 
Reich and Fuentes (2007)]. 

An area of future work is to apply this method to longitudinal periodontal 
data. Periodontal data is often collected repeatedly for a single patient over 
time to monitor disease progression. Reich and Hodges (2008) propose a 
spatiotemporal model for attachment loss. It should be possible to extend 
this model to accommodate mixed multivariate responses and informative 
missing teeth. 
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SUPPLEMENTARY MATERIAL 

Computer code (spatial factor.R) (DOI: 10. 1214/09- AOAS278SUPP; .R). 
In the supplemental file, we include R code to analyze a single continuous 
response with informative missingness. Use of the code is described in the 
file and is illustrated with an analysis of a simulated data set. 
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